Fix slot-based collator panic during warp sync (#11072)#11381

Merged
bkchr merged 7 commits into paritytech:master from
clangenb:cl/wait-for-aura
Apr 2, 2026

@clangenb
Contributor

@clangenb clangenb commented Mar 16, 2026

When a parachain collator starts with --authoring=slot-based and performs warp sync, the slot-based-block-builder essential task immediately calls slot_duration() which requires AuraApi_slot_duration. During warp sync the runtime isn't ready, so this fails and the task returns, shutting down the node.

The lookahead collator avoids this by calling wait_for_aura() before starting. This PR adds an equivalent guard to the slot-based collator.
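
The idea of such a guard can be sketched as follows. This is a minimal, std-only illustration and not the actual `wait_for_aura()` helper: the function name `wait_until_available`, the closure-based query, and the polling interval are all assumptions made for the example.

```rust
use std::thread::sleep;
use std::time::Duration;

/// Hypothetical guard: blocks until `query` yields a value.
///
/// `query` stands in for a runtime API call such as `slot_duration()`,
/// returning `None` while the API is not yet available (for example
/// while warp sync is still in progress).
pub fn wait_until_available<T>(
    mut query: impl FnMut() -> Option<T>,
    poll_interval: Duration,
) -> T {
    loop {
        if let Some(value) = query() {
            // The API answered, so the caller may now start its main loop.
            return value;
        }
        // Not ready yet: wait instead of returning, because returning
        // from an essential task would shut the node down.
        sleep(poll_interval);
    }
}
```

The point of the pattern is that the essential task never starts (and therefore never exits early) until the runtime can answer the query.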

Manual test

Before the fix, the collator panicked after the relay chain warp sync because AuraApi_slot_duration was not available. With the fix, the panic no longer occurs.

 ./target/release/polkadot-parachain \
    --chain asset-hub-polkadot \
    --sync warp \
    --authoring=slot-based \
    --tmp -- --sync warp

Closes #11072.

  The slot-based block builder task crashes during warp sync because it
  immediately calls AuraApi_slot_duration before the runtime is available.
  Since it runs as an essential task, this kills the entire node.

  Wait for the Aura runtime API to become available before entering the
  main loop, matching the pattern used by the lookahead collator. Also
  convert the fatal  on slot_duration failure to a retryable
   so transient runtime API errors don't kill the task.

  Closes paritytech#11072
Comment thread cumulus/client/consensus/aura/src/collators/slot_based/block_builder_task.rs Outdated
@clangenb
Contributor Author

/cmd prdoc --audience node_dev --bump patch

@clangenb
Contributor Author

/cmd label T11-client

@paritytech-cmd-bot-polkadot-sdk paritytech-cmd-bot-polkadot-sdk Bot added the T11-documentation This PR/Issue is related to documentation. label Mar 16, 2026
@clangenb
Contributor Author

/cmd label T9-cumulus

@paritytech-cmd-bot-polkadot-sdk paritytech-cmd-bot-polkadot-sdk Bot added the T9-cumulus This PR/Issue is related to cumulus. label Mar 16, 2026
Member

@bkchr bkchr left a comment

Please move the same code into the omni node, as it is done for the lookahead collator.

  Address review feedback: instead of polling for the Aura runtime API
  inside block_builder_task, use the existing wait_for_aura helper at the
  omni node level, matching the lookahead collator pattern.
@clangenb
Contributor Author

Thanks, you are right. That makes much more sense, and we can also reuse the existing wait_for_aura helper.

@clangenb
Contributor Author

/cmd prdoc --audience node_dev --bump patch

@github-actions
Contributor

Command "prdoc --audience node_dev --bump patch" has failed ❌! See logs here

@clangenb
Contributor Author

/cmd prdoc --audience node_dev --bump patch --force

@clangenb
Contributor Author

/cmd prdoc --audience node_dev --bump patch --force

@clangenb
Contributor Author

@bkchr may I ask you to take a look at it again? :)

Contributor

@skunert skunert left a comment

Nice

@bkchr bkchr enabled auto-merge April 2, 2026 20:54
@bkchr bkchr added this pull request to the merge queue Apr 2, 2026
Merged via the queue into paritytech:master with commit a1a2bbf Apr 2, 2026
224 of 229 checks passed
lexnv added a commit that referenced this pull request Apr 3, 2026
Squashed commit of the following:

commit e9ab097
Author: eskimor <robert@gonimo.com>
Date:   Fri Apr 3 00:53:19 2026 +0200

    Pr doc

commit ef9dfac
Author: eskimor <robert@gonimo.com>
Date:   Fri Apr 3 00:41:08 2026 +0200

    Fix warning

commit 076e7e9
Author: eskimor <robert@gonimo.com>
Date:   Fri Apr 3 00:09:28 2026 +0200

    Tests

commit cdfc57c
Author: eskimor <robert@gonimo.com>
Date:   Fri Apr 3 00:09:03 2026 +0200

    Fix test

commit a1a2bbf
Author: clangenb <37865735+clangenb@users.noreply.github.com>
Date:   Thu Apr 2 22:55:34 2026 +0200

    Fix slot-based collator panic during warp sync (#11072) (#11381)

    When a parachain collator starts with `--authoring=slot-based` and
    performs warp sync, the `slot-based-block-builder` essential task
    immediately calls `slot_duration()` which requires
    `AuraApi_slot_duration`. During warp sync the runtime isn't ready, so
    this fails and the task returns, shutting down the node.

    The lookahead collator avoids this by calling `wait_for_aura()` before
    starting. This PR adds an equivalent guard to the slot-based collator.

    ### Manual test
    Before the fix the collator panicked after the relay chain warp sync
    with AuraApi_slot_duration not available, which does not occur anymore
    now.
    ```
     ./target/release/polkadot-parachain \
        --chain asset-hub-polkadot \
        --sync warp \
        --authoring=slot-based \
        --tmp -- --sync warp
    ```
    Closes #11072.

    ---------

    Co-authored-by: cmd[bot] <41898282+github-actions[bot]@users.noreply.github.com>
    Co-authored-by: clangenb <clangenb@users.noreply.github.com>

commit 802db0b
Author: 0xRVE <robertvaneerdewijk@gmail.com>
Date:   Thu Apr 2 16:39:00 2026 +0300

    [pallet-revive] Add vesting precompile (#11398)

    ## Summary

    Adds a new built-in precompile (`pallet-revive-precompile-vesting`) that
    exposes Substrate's `pallet-vesting` to EVM contracts via pallet-revive.
    EVM contracts can call `vest()`, `vestOther(address)`,
    `vestingBalance()`, and `vestingBalanceOf(address)` at the precompile
    address `0x0902`.

    ## Changes

    - **`substrate/frame/revive/uapi/sol/IVesting.sol`**: New Solidity
    interface defining the vesting precompile ABI
    - **`substrate/frame/revive/uapi/src/precompiles/vesting.rs`**: Binds
    the Solidity interface via `alloy_core::sol!`
    - **`substrate/frame/revive/precompiles/`**: New crate implementing the
    `Precompile` trait — dispatches `vest`/`vestOther` through
    `pallet_vesting` and queries locked balances via `VestingSchedule`
    - **`substrate/frame/revive/src/tests.rs`**: Trailing comma fix in
    `construct_runtime!`

    ## Test plan
    - [x] New vesting precompile tests pass (`vest`, `vestOther`,
    `vestingBalance`, `vestingBalanceOf`)
    - [x] Existing pallet-revive tests unaffected

    ---------

    Co-authored-by: cmd[bot] <41898282+github-actions[bot]@users.noreply.github.com>
    Co-authored-by: PG Herveou <pgherveou@gmail.com>

commit 5fbeadd
Author: Dhiraj Sah <dhiraj@parity.io>
Date:   Thu Apr 2 17:35:52 2026 +0530

    fix(pallet-multi-asset-bounties): use non-destructive read in calculate_payout() (#11425)

    ## Description

    `calculate_payout()` in `pallet-multi-asset-bounties` uses
    `ChildBountiesValuePerParent::take()` — a destructive read that deletes
    the storage entry — instead of `get()`. Since `calculate_payout()` is
    called from multiple code paths, the `take()` causes incorrect behavior
    on subsequent calls.

    `calculate_payout()` is called from two places:

    1. `do_process_payout_payment()` (lib.rs:1736) — invoked by
    `award_bounty()` and `retry_payment()`
    2. `do_check_payout_payment_status()` (lib.rs:1771) — invoked by
    `check_status()` on payment success

    When a parent bounty with child bounties is awarded:

    - The **first call** (from `award_bounty()`) reads
    `ChildBountiesValuePerParent` via `take()`, correctly computing
    `parent_value - children_value`, but **deletes the storage entry** as a
    side effect.
    - The **second call** (from `check_status()` on success) reads the
    now-deleted storage, gets `0`, and emits `BountyPayoutProcessed` with
    `value: parent_value` instead of the correct `value: parent_value -
    children_value`.

    Additionally, if a non-synchronous `Paymaster` implementation is used
    where `check_payment()` can return `Failure`, the `retry_payment()` path
    would call `calculate_payout()` again on the deleted storage, attempting
    to pay the full parent value instead of the reduced amount.

    ## Integration

    No integration changes required for downstream projects. This is a
    bugfix internal to `pallet-multi-asset-bounties` with no changes to
    public APIs, storage layout, or trait definitions.

    ## Review Notes

    Three changes were made:

    ### 1. `calculate_payout()` — `take()` replaced with `get()`

    ```diff
    - let children_value = ChildBountiesValuePerParent::<T, I>::take(parent_bounty_id);
    + let children_value = ChildBountiesValuePerParent::<T, I>::get(parent_bounty_id);
    ```

    This makes `calculate_payout()` idempotent — safe to call multiple times
    for the same bounty.
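
    The difference between the two reads can be sketched with a toy model.
    This is only an illustration backed by a `HashMap`, not the pallet
    code: `ToyStorage`, `with_children`, `payout_with_take` and
    `payout_with_get` are hypothetical names.

    ```rust
    use std::collections::HashMap;

    /// Toy stand-in for the `ChildBountiesValuePerParent` storage map.
    pub struct ToyStorage {
        children_value: HashMap<u32, u64>,
    }

    impl ToyStorage {
        /// Creates storage holding one parent with the given children value.
        pub fn with_children(parent: u32, value: u64) -> Self {
            let mut children_value = HashMap::new();
            children_value.insert(parent, value);
            Self { children_value }
        }

        /// Destructive read: the entry is removed, so a second call sees 0.
        pub fn payout_with_take(&mut self, parent: u32, parent_value: u64) -> u64 {
            let children = self.children_value.remove(&parent).unwrap_or(0);
            parent_value.saturating_sub(children)
        }

        /// Non-destructive read: repeated calls return the same payout.
        pub fn payout_with_get(&self, parent: u32, parent_value: u64) -> u64 {
            let children = self.children_value.get(&parent).copied().unwrap_or(0);
            parent_value.saturating_sub(children)
        }
    }
    ```

    With a parent value of 100 and a children value of 30, the `take`
    variant yields 70 on the first call and 100 on the second, while the
    `get` variant yields 70 on every call.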

    ### 2. `remove_bounty()` — explicit storage cleanup added

    ```diff
      None => {
          Bounties::<T, I>::remove(parent_bounty_id);
          ChildBountiesPerParent::<T, I>::remove(parent_bounty_id);
          TotalChildBountiesPerParent::<T, I>::remove(parent_bounty_id);
    -     debug_assert!(ChildBountiesValuePerParent::<T, I>::get(parent_bounty_id).is_zero());
    +     ChildBountiesValuePerParent::<T, I>::remove(parent_bounty_id);
      },
    ```

    The `debug_assert!` was removed because it was not a true invariant — it
    only passed because `take()` had already deleted the value. When child
    bounties are paid out, `ChildBountiesValuePerParent` remains non-zero
    until parent bounty cleanup.

    ### 3. Test updated

    Added an event assertion to the existing `check_status_works` test to
    verify `BountyPayoutProcessed` emits the correct net payout value
    (`parent_value - child_value`) instead of the full parent value. This
    assertion fails with `take()` and passes with `get()`.

    ### Impact

    - **Current deployments (KAH, PAH)**: Both use `LocalPay` where
    `check_payment()` always returns `Success`. The retry/lock path is
    unreachable, but the `BountyPayoutProcessed` event emits an incorrect
    payout value for parent bounties with child bounties.
    - **Future deployments**: If the pallet is configured with an async
    `Paymaster` (e.g., XCM-based) where `check_payment()` can return
    `Failure`, the `retry_payment()` path would compute a wrong payout
    amount, potentially leading to permanent fund lock with no recovery path
    (since `close_bounty()` rejects `PayoutAttempted` status).

    ---------

    Co-authored-by: cmd[bot] <41898282+github-actions[bot]@users.noreply.github.com>

commit e3865cf
Author: Rodrigo Quelhas <22591718+RomarQ@users.noreply.github.com>
Date:   Thu Apr 2 10:41:38 2026 +0100

    Deprecate `ValidateUnsigned` trait and `#[pallet::validate_unsigned]` attribute (#10150)

    Part of #2415
    Closes #2436

    Related: #6325 #6326

    ## Summary

    Deprecates the `ValidateUnsigned` trait and
    `#[pallet::validate_unsigned]` attribute in favor of the new
    `TransactionExtension` API. This is a non-breaking change that adds
    deprecation warnings to guide users toward the modern transaction
    validation approach.

    ## Motivation

    The `ValidateUnsigned` trait was the legacy approach for validating
    unsigned transactions in FRAME pallets. The newer `TransactionExtension`
    trait provides a more flexible and composable way to handle transaction
    validation, including both signed and unsigned transactions.

    ## Changes

    ### Deprecated APIs
    - ✅ Added `#[deprecated]` attribute to `ValidateUnsigned` trait
    - ✅ Added deprecation warning to `#[pallet::validate_unsigned]` macro
    attribute

    ### Migration (Using `TransactionExtensions`)

    https://paritytech.github.io/polkadot-sdk/master/polkadot_sdk_docs/reference_docs/transaction_extensions

    ## Impact

    - **Non-breaking:** Existing code continues to work with deprecation
    warnings
    - **Compiler warnings:** Users will see deprecation notices guiding them
    to migrate
    - **Timeline:** Full removal planned for a future major release (TBD)

    ## Review Notes

    - The `#[pallet::validate_unsigned]` deprecation warning might be
    redundant since it's always used together with `ValidateUnsigned`, but
    both are included for completeness and clarity.

    ## Follow-up Tasks

    The following pallets and crates need to be migrated to
    `TransactionExtension` in subsequent PRs:

    **Runtime crates:**
    - [ ] `polkadot-runtime-common`
    - [ ] `polkadot-runtime-parachains`

    **FRAME pallets:**
    - [ ] `pallet-babe`
    - [ ] `pallet-beefy`
    - [ ] `pallet-election-provider-multi-block`
    - [ ] `pallet-grandpa`
    - [x] `pallet-im-online`
    #11235
    - [x] `pallet-mixnet`
    #11010

    **Core:**
    - [ ] `frame-executive`
    - [ ] `frame-system`

    **Examples:**
    - [x] `pallet-example-offchain-worker`
    #10716

    **Testing:**
    - [ ] `substrate-test-runtime`

    ## Open Question

    Should we remove the `ValidateUnsigned` bound from the type parameter
    `V` in the `Applyable` trait?

    ---------

    Co-authored-by: Guillaume Thiolliere <guillaume.thiolliere@parity.io>
    Co-authored-by: Shawn Tabrizi <shawntabrizi@gmail.com>
    Co-authored-by: Branislav Kontur <bkontur@gmail.com>

commit 5c97c0f
Author: clangenb <37865735+clangenb@users.noreply.github.com>
Date:   Thu Apr 2 01:06:59 2026 +0200

    [Penpal] fix genesis presets - assign proper ED to accounts (#11575)

    Penpal initialized asset balances for some accounts with values below
    the ED. This went undetected because no unit tests actually use the
    presets. This PR fixes the invalid values and adds unit tests that
    validate that the presets at least build.

    Closes #11558.

    ---------

    Co-authored-by: clangenb <clangenb@users.noreply.github.com>
    Co-authored-by: cmd[bot] <41898282+github-actions[bot]@users.noreply.github.com>

commit 59c4053
Author: Egor_P <egor@parity.io>
Date:   Wed Apr 1 20:29:31 2026 +0200

    [Release|CI/CD] Fixes for release flows (#11578)

    This PR backports a few fixes for some release flows that were made in
    the stable2603 branch. In particular:
    - Fixed resume check in Crates Publish flow
    - Fixed missing llvm path on macos builds
    - Fixed script that reverts path deps in Cargo.toml files
    - Bumped parity-publish version
    - Fixed check if the post-crates-release branch exists in Crates
    Publish flow

    ---------

    Co-authored-by: BDevParity <bruno.devic@parity.io>

commit e9e4769
Author: DenzelPenzel <15388928+DenzelPenzel@users.noreply.github.com>
Date:   Wed Apr 1 14:11:28 2026 +0100

    Add statement store e2e integration tests (#11237)

    # Description

    #10783

    E2E Integration Tests (zombienet-sdk)

    Functional tests (statement_store)
    - statement_store_genesis_inject — submit + subscribe round-trip with
    genesis-injected allowances, 2-node propagation with data
    integrity verification
    - statement_store_sudo_allowance — sudo-based runtime allowance setting,
    8 concurrent multi-account submissions, 4-node fan-out propagation

    ### Test Infrastructure
    - common.rs — shared helpers: create_test_statement, submit_statement,
    expect_one_statement, expect_statements_unordered,
    subscribe_topic, spawn_network, spawn_network_sudo
    - sc_statement_store::subxt_client — custom subxt config (CustomConfig)
    for non-standard transaction extensions
    (VerifyMultiSignature, RestrictOrigins), set_allowances_via_sudo for
    runtime-configured networks
    - sc_statement_store::test_utils — shared keypair generation and
    allowance storage item builders (get_keypair,
    create_allowance_items, create_uniform_allowance_items)
    - CI matrix registration for all statement store test groups

    ### What we cover in this PR

    The test statement_store_sudo_allowance covers the following item from #11534:
    - Propagation under normal load: Zombienet tests to cover concurrent
    multi-client corner cases and verify statements reach all real nodes,
    NOT including during major sync

    ---------

    Co-authored-by: cmd[bot] <41898282+github-actions[bot]@users.noreply.github.com>

commit a41bb09
Author: Javier Viola <363911+pepoviola@users.noreply.github.com>
Date:   Wed Apr 1 14:34:27 2026 +0200

    [zombienet] typo in preflight (#11602)

    Typo in the preflight logic was merged by accident.
    Thx!

Signed-off-by: Alexandru Vasile <alexandru.vasile@parity.io>
github-merge-queue Bot pushed a commit that referenced this pull request Apr 3, 2026
Fix a regression introduced by #11381, where we wrapped the slot-based
collator launch in an async task that first calls `wait_for_aura`, then
spawns the actual long-running collator tasks via `slot_based::run()`.
The wrapper was spawned with `spawn_essential_handle()`.

Essential tasks shut down the node when they complete. The init wrapper
completes immediately after spawning, the TaskManager sees an essential
task exit, and the node shuts down.

This only affects parachain collators started with
`--authoring=slot-based`.

Fix: use `spawn_handle()` for the short-lived init wrapper. The child
tasks inside `slot_based::run()` remain correctly marked as essential.
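
The failure mode can be illustrated with a toy model. This is a minimal sketch and not the real `TaskManager`: the names `Kind`, `ToyTaskManager`, `on_task_exit` and `launch` are hypothetical, and the only behavior modeled is that the exit of an essential task triggers shutdown.

```rust
/// Whether a task's exit should bring the node down.
#[derive(Clone, Copy, PartialEq, Debug)]
pub enum Kind {
    Essential,
    Regular,
}

/// Toy task manager that remembers which essential task exited first.
pub struct ToyTaskManager {
    shutdown_cause: Option<&'static str>,
}

impl ToyTaskManager {
    pub fn new() -> Self {
        Self { shutdown_cause: None }
    }

    /// Records a task exit. Only essential exits trigger shutdown.
    pub fn on_task_exit(&mut self, name: &'static str, kind: Kind) {
        if kind == Kind::Essential && self.shutdown_cause.is_none() {
            self.shutdown_cause = Some(name);
        }
    }

    pub fn shutdown_cause(&self) -> Option<&'static str> {
        self.shutdown_cause
    }
}

/// Models the launch: the init wrapper waits, spawns the long-running
/// child tasks, and then completes. The children keep running, so they
/// never report an exit in this model.
pub fn launch(wrapper_kind: Kind) -> Option<&'static str> {
    let mut manager = ToyTaskManager::new();
    manager.on_task_exit("init-wrapper", wrapper_kind);
    manager.shutdown_cause()
}
```

In this model, `launch(Kind::Essential)` reports the wrapper as the shutdown cause, while `launch(Kind::Regular)` reports none, which mirrors why the short-lived wrapper should not be marked essential.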

An easy way to reproduce: spawn a Zombienet network with a 2-validator
relay chain and a single slot-based parachain collator. The collator
process starts but shuts down immediately. This is the same setup used
by the staking-miner nightly test, which started to fail after #11381
was merged, e.g.
[here](https://github.com/paritytech/polkadot-staking-miner/actions/runs/23928039324/job/69807526676).
For example, in your SDK repo:
```
cd substrate/frame/staking-async/runtimes/papi-tests
just setup
just run fake-dev 
```
which launches Zombienet, spawning:
- alice (relay validator, port 9944): polkadot
- bob (relay validator, port 9945): polkadot
- charlie (parachain collator, port 9946): polkadot-parachain
  --collator --authoring=slot-based

Port 9946 never comes up.

I have also verified that the fix coming from #11381 still works,
running manually `./target/release/polkadot-parachain --chain
asset-hub-polkadot --sync warp --authoring=slot-based --tmp -- --sync
warp`.

---------

Co-authored-by: cmd[bot] <41898282+github-actions[bot]@users.noreply.github.com>

Labels

T9-cumulus This PR/Issue is related to cumulus.
T11-documentation This PR/Issue is related to documentation.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[polkadot-assethub] warp sync with --authoring=slot-based panics with AuraApi_slot_duration not found

4 participants